This notebook contains a set of analyses of mrbananagrabber’s BoardGameGeek collection. The bulk of the analysis focuses on building a user-specific predictive model to predict the games that the user is likely to add to their collection.
By analyzing a user’s collection and training a predictive model, I am able to answer questions such as:
What designers/mechanics/genres does a user tend to like or dislike?
What older games might they be interested in adding to their collection?
What new and upcoming games should they check out?
How many games has mrbananagrabber owned/rated/played?
What types of game does mrbananagrabber own? I can look at the most frequent types of categories, mechanics, designers, and artists that appear in a user’s collection.
What games does mrbananagrabber currently have in their collection? The following table can be used to examine games the user owns, along with some helpful information for selecting the right game for a game night!
Use the filters above the table to sort/filter based on information about the game, such as year published, recommended player counts, or playing time.
I’ll now examine the predictive models trained on the user’s collection.
For an individual user, I train a predictive model on their collection in order to predict whether a user owns a game. The outcome, in this case, is binary: does the user have a game listed in their collection or not? This is the setting for training a classification model, where the model aims to learn the probability that a user will add a game to their collection based on its observable features.
How does a model learn what a user is likely to own? The training process is a matter of examining historical games and finding patterns between game features (designers, mechanics, playing time, etc.) and the games in the user’s collection.
Note: I train models to predict whether a user owns a game based only on information that could be observed about the game at its release: playing time, player count, mechanics, categories, genres, and selected designers, artists, and publishers. I do not make use of BGG community information, such as its average rating or number of user ratings (though I do use a game’s estimated complexity as a feature). This is to ensure the model can predict newly released games and is not dependent on the BGG community to rate them.
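To make this setup concrete, here is a minimal Python sketch (not the notebook's own code) of turning game records into a binary outcome plus release-time features. The records, the field names, and the `build_rows` helper are all hypothetical; the only point is that community fields such as the average rating are never read when the features are built.

```python
# Hypothetical game records; every field name here is an assumption for illustration.
games = [
    {"name": "Game A", "playing_time": 90, "mechanics": ["Deck Building"],
     "designers": ["Designer X"], "avg_rating": 8.1, "owned": True},
    {"name": "Game B", "playing_time": 30, "mechanics": ["Dice Rolling", "Deck Building"],
     "designers": ["Designer Y"], "avg_rating": 6.4, "owned": False},
]


def build_rows(records):
    """Return (feature_names, X, y) using only information known at release."""
    # Every mechanic/designer seen in the data becomes one 0/1 indicator column.
    indicators = sorted(
        {f"mechanic_{m}" for g in records for m in g["mechanics"]}
        | {f"designer_{d}" for g in records for d in g["designers"]}
    )
    feature_names = ["playing_time"] + indicators
    X, y = [], []
    for g in records:
        present = ({f"mechanic_{m}" for m in g["mechanics"]}
                   | {f"designer_{d}" for d in g["designers"]})
        X.append([g["playing_time"]] + [1 if name in present else 0 for name in indicators])
        y.append(1 if g["owned"] else 0)  # binary outcome: in the collection or not
    # avg_rating is deliberately never used, mirroring the exclusion of community data.
    return feature_names, X, y


feature_names, X, y = build_rows(games)
```

Each row of `X` describes one game and `y` records whether the user owns it, which is the shape of data a classification model expects.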
A predictive model gives us more than just predictions. We can also ask, what did the model learn from the data? What predicts the outcome? In the case of predicting a boardgame collection, what did the model find to be predictive of games a user owns?
To answer this, I can examine the coefficients from a logistic regression model with ridge regularization (which I will refer to as a penalized logistic regression). Positive values indicate that a feature increases a user’s probability of owning/rating a game, while negative values indicate that a feature decreases the probability. To be precise, the coefficients indicate the effect of a particular feature on the log-odds of a user owning a game.
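As a worked example of reading a coefficient on the log-odds scale, the Python sketch below uses made-up numbers: a hypothetical baseline probability of 0.05 and a hypothetical coefficient of 0.8. Neither value comes from the fitted model.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


baseline_prob = 0.05  # hypothetical probability of owning a game without the feature
coefficient = 0.8     # hypothetical log-odds effect of, say, a favored designer

# A coefficient adds to the log-odds, which multiplies the odds by exp(coefficient).
odds_multiplier = math.exp(coefficient)
new_prob = inv_logit(logit(baseline_prob) + coefficient)

print(f"odds multiplied by {odds_multiplier:.2f}; "
      f"probability {baseline_prob:.3f} -> {new_prob:.3f}")
```

Because the effect is additive on the log-odds scale, the same coefficient changes the probability by different amounts depending on the baseline.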
This model examines a wide variety of game features (506 features, to be exact) and estimates their effect on whether a user owns a game. These estimates are then shrunk toward zero by an amount controlled by a tuning parameter (lambda), whose value is estimated from the data.
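The shrinkage can be seen in a toy example. The sketch below is a plain-Python gradient descent fit, not glmnet and not the notebook's code; it assumes a penalty written as (lambda / 2) times the squared coefficient norm, omits the intercept for brevity, and uses a tiny made-up dataset.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_ridge_logistic(X, y, lam, lr=0.5, steps=5000):
    """Minimize mean log loss + (lam / 2) * ||beta||^2 by gradient descent (no intercept)."""
    n, k = len(X), len(X[0])
    beta = [0.0] * k
    for _ in range(steps):
        grad = [lam * b for b in beta]  # gradient of the ridge penalty
        for row, target in zip(X, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, row)))
            for j in range(k):
                grad[j] += (p - target) * row[j] / n  # gradient of the mean log loss
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical 0/1 features (e.g. "favored designer", "favored mechanic") and ownership labels.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 0, 1]

norms = {}
for lam in (0.01, 0.1, 1.0):
    beta = fit_ridge_logistic(X, y, lam)
    norms[lam] = math.sqrt(sum(b * b for b in beta))
    print(lam, [round(b, 3) for b in beta], round(norms[lam], 3))
```

Larger values of `lam` pull the coefficient vector closer to zero, which is the behavior the tuning parameter controls.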
The following visualization shows the path of each feature as it enters the model, with highly influential features tending to enter the model early with large positive or negative effects.
This type of model enables me to examine the effects of specific features on a user’s collection. For instance, what is a user’s favorite designer? Least favorite mechanic? The following plots indicate specific effects for different kinds of features.
In addition to training a logistic regression, I trained another type of model using boosted trees (LightGBM), a flexible nonparametric method that is well suited for prediction.
Which features were most used by this model? Features that are important in predicting a user’s collection will rank near the top on the importance measures cover, frequency, and/or gain.
How well did the model do in predicting the user’s collection?
This section contains a variety of visualizations and metrics for assessing the performance of the model(s). If you’re not particularly interested in predictive modeling, skip down further to the predictions from the model.
An easy way to examine the performance of a classification model is to view a separation plot.
I plot the predicted probabilities from the model for every game (from resampling), ordered from lowest to highest. I then overlay a blue line for each game that the user owns. A good classifier separates the blue (games owned by the user) from the white (games not owned by the user), with most of the blue occurring at the highest probabilities (right side of the chart).
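The idea behind a separation plot can be sketched in a few lines of Python as a text stand-in for the chart. The probabilities and ownership flags below are hypothetical, and `separation_strip` is an illustrative helper rather than a function from any plotting library.

```python
def separation_strip(probs, owned, owned_char="|", other_char="."):
    """Sort games by predicted probability (low to high) and mark the owned ones."""
    ordered = sorted(zip(probs, owned), key=lambda pair: pair[0])
    return "".join(owned_char if is_owned else other_char for _, is_owned in ordered)


# Hypothetical predicted probabilities and ownership flags (1 = owned).
probs = [0.9, 0.1, 0.5, 0.7, 0.2]
owned = [1, 0, 0, 1, 0]

strip = separation_strip(probs, owned)
print(strip)  # prints "...||"
```

Here every owned game sits at the right end of the strip, which is what a well-separating classifier looks like.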
I can more formally assess how well each model did in resampling by looking at the area under the receiver operating characteristic curve (roc_auc). A perfect model would receive a score of 1, while a model that cannot predict the outcome will default to a score of 0.5. The extent to which something is a good score depends on the setting, but generally anything in the .8 to .9 range is very good while the .7 to .8 range is perfectly acceptable.
| type | wflow_id | .metric | mean | std_err | n |
|---|---|---|---|---|---|
| resamples | glmnet | roc_auc | 0.927 | 0.010 | 5 |
| resamples | lightgbm | roc_auc | 0.919 | 0.011 | 5 |
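For intuition about the metric, the area under the ROC curve equals the probability that a randomly chosen owned game receives a higher predicted probability than a randomly chosen unowned game. The Python sketch below computes it directly from that definition on hypothetical scores; it is not the routine that produced the table above.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: three of the four owned/unowned pairs are ordered correctly.
auc_example = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75
```

A model that gives every game the same score wins no comparisons outright and ties all of them, which is where the 0.5 floor comes from.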
Another way of looking at what the model learned is to see its predictions on the training set. The models are trained on games published through 2021; of these games, what did the model like for the user?
Top (Older) Games for mrbananagrabber
Rankings based on predictive model trained on user's collection using games released through 2021
| rank | game | description | Pr(Own) | Own | |
|---|---|---|---|---|---|
| 1 | Arkham Horror: The Card Game (Revised Edition) (2021) | The boundaries between worlds have drawn perilously thin. Dark forces work in the shadows and call upon unspeakable horrors, strange happenings are discovered all throughout the city of Arkham, Massachusetts, and behind it all an Ancient One manipulates everything from beyond the veil. It is time to revisit that which started it all… With a revamped system of organization and a number of quali... | 0.989 | no | |
| 2 | Brass: Lancashire (2007) | Brass: Lancashire — first published as Brass — is an economic strategy game that tells the story of competing cotton entrepreneurs in Lancashire during the industrial revolution. You must develop, build, and establish your industries and network so that you can capitalize demand for iron, coal and cotton. The game is played over two halves: the canal phase and the rail phase. To win the game, s... | 0.983 | no | |
| 3 | Era: Medieval Age (2019) | Era: Medieval Age serves as the spiritual successor to Roll Through The Ages. While Roll Through The Ages was a pioneer for roll-and-write-style games, Era is a pioneer for roll-and-build! In Era, your dice represent different classes of medieval society as players attempt to build the most prosperous city. The "build" comes into play as players actually build their cities on their boards. You... | 0.977 | no | |
| 4 | Star Wars: Rebellion (2016) | Star Wars: Rebellion is a board game of epic conflict between the Galactic Empire and Rebel Alliance for two to four players. Experience the Galactic Civil War like never before. In Rebellion, you control the entire Galactic Empire or the fledgling Rebel Alliance. You must command starships, account for troop movements, and rally systems to your cause. Given the differences between the Empire ... | 0.942 | no | |
| 5 | Orléans (2014) | During the medieval goings-on around Orléans, you must assemble a following of farmers, merchants, knights, monks, etc. to gain supremacy through trade, construction and science in medieval France. In Orléans, you will recruit followers and put them to work to make use of their abilities. Farmers and Boatmen supply you with money and goods; Knights expand your scope of action and secure your m... | 0.929 | no | |
| 6 | Unmatched: Little Red Riding Hood vs. Beowulf (2020) | In battle, there are no equals. ONCE UPON A TIME, Little Red Riding Hood took her basket of nasty tricks and faced off against the legendary Beowulf in this exciting Unmatched set. "What big eyes you have, Wulfie!" "That’s called 'rage', kid!" Little Red features a clever card-combo mechanism. Matching icons on the cards she plays to the one in her "basket" (discard pile), triggers potent e... | 0.916 | no | |
| 7 | Star Wars: X-Wing (Second Edition) (2018) | X-Wing Second Edition puts you in command of your own squadron of advanced starfighters locked in thrilling, tactical space combat. Following in the footsteps of the first edition, the second edition refines the intuitive and exciting core formula of maneuvering your ships into position by placing a central focus on the visceral thrill of flying starships in the Star Wars galaxy. During a batt... | 0.896 | yes | |
| 8 | Merchants & Marauders (2010) | Merchants & Marauders lets you live the life of an influential merchant or a dreaded pirate in the Caribbean during the Golden Age of Piracy. Seek your fortune through trade, rumor hunting, missions, and of course, plundering. Modify your ship, buy impressive vessels, load deadly special ammunition, and hire specialist crew members. Will your captain gain eternal glory and immense wealth - or ... | 0.853 | yes | |
| 9 | Shadows: Amsterdam (2018) | Amsterdam, present day. A crime has been committed, but the police investigation is going nowhere. An anonymous client has called your detective agency to investigate. However, your rivals are on the case as well, so there’s no time to waste. Your Intelligence Officer will guide your Detectives through the city, by sending pictures to communicate. Each image contains location intel ... for tho... | 0.833 | no | |
| 10 | Pandemic: Iberia (2016) | Welcome to the Iberian Peninsula! Set in 1848, Pandemic Iberia asks you to take on the roles of nurse, railwayman, rural doctor, sailor, and more to find the cures to malaria, typhus, the yellow fever, and cholera. From Barcelona to Lisboa, you will need to travel by carriage, by boat, or by train to help the Iberian populace. While doing so, distributing purified water and developing railways... | 0.807 | no | |
The following table shows, for each publication year from 2011 through 2021 (the most recent years of the training set), the 10 games the model rates as most likely to be owned by the user.
Games highlighted in blue are currently in the user’s collection; games highlighted in light blue are games that the user previously owned.
Top Games by Year for mrbananagrabber
Rankings based on predictive model trained on user's collection using games released through 2021
| Rank | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A Few Acres of Snow | Archipelago | Rococo | Orléans | Mombasa | Star Wars: Rebellion | Pandemic Legacy: Season 2 | Star Wars: X-Wing (Second Edition) | Era: Medieval Age | Unmatched: Little Red Riding Hood vs. Beowulf | Arkham Horror: The Card Game (Revised Edition) |
| 2 | The Lord of the Rings: The Card Game | Kemet | BANG! The Dice Game | AquaSphere | XCOM: The Board Game | Pandemic: Iberia | Gaia Project | Shadows: Amsterdam | Crusader Kings | Cosmic Encounter Duel | Unmatched: Battle of Legends, Volume Two |
| 3 | Mage Knight Board Game | Terra Mystica | Room 25 | Akrotiri | El Grande Big Box | Mansions of Madness: Second Edition | Century: Golem Edition | Brass: Birmingham | Azul: Summer Pavilion | Unmatched: Buffy the Vampire Slayer | Explorers |
| 4 | A Game of Thrones: The Board Game (Second Edition) | Clash of Cultures | Impulse | Star Wars: Imperial Assault | Runebound (Third Edition) | Sherlock Holmes Consulting Detective: Jack the Ripper & West End Adventures | Twilight Imperium: Fourth Edition | Concordia Venus | Aftermath | Bites | Bloodborne: The Board Game |
| 5 | Letters from Whitechapel | The Manhattan Project | Lewis & Clark: The Expedition | Sons of Anarchy: Men of Mayhem | T.I.M.E Stories | Terraforming Mars | Azul | Blackout: Hong Kong | Maracaibo | Unmatched: Cobble & Fog | Kemet: Blood and Sand |
| 6 | Gears of War: The Board Game | Space Cadets | The New Science | Istanbul | Mysterium | Agricola (Revised Edition) | Gloomhaven | Newton | Unmatched: Battle of Legends, Volume One | On Mars | Imperium: Classics |
| 7 | The Castles of Burgundy | Rex: Final Days of an Empire | Legacy: The Testament of Duke de Crecy | Patchwork | 7 Wonders Duel | Arkham Horror: The Card Game | Sherlock Holmes Consulting Detective: Carlton House & Queen's Park | Cosmic Encounter: 42nd Anniversary Edition | Barrage | Gloomhaven: Jaws of the Lion | Boonlake |
| 8 | Tragedy Looper | Android: Netrunner | Spyrium | Deception: Murder in Hong Kong | Forbidden Stars | Sushi Go Party! | The Godfather: Corleone's Empire | Rising Sun | Unmatched: Robin Hood vs. Bigfoot | Fallout Shelter: The Board Game | Roll Camera!: The Filmmaking Board Game |
| 9 | Elder Sign | The Great Zimbabwe | Le Fantôme de l'Opéra | Fields of Arle | Blood Rage | New Angeles | Civilization: A New Dawn | Gen7: A Crossroads Game | The Castles of Burgundy | Sherlock Holmes Consulting Detective: The Baker Street Irregulars | Imperium: Legends |
| 10 | Eminent Domain | Il Vecchio | Circus Train (Second Edition) | Hyperborea | Grand Austria Hotel | Avenue | Semper Fidelis: Bitwa o Lwów 1918-1919 | Azul: Stained Glass of Sintra | Wingspan | Praga Caput Regni | CULTivate |
The following table shows the model’s predictions for games in the training set.
What do the model’s predicted probabilities mean? Or, put another way, how well calibrated are the model’s predictions?
If the model assigns a probability of 5%, how often does the outcome actually occur? A well calibrated model is one in which the predicted probabilities reflect the probabilities we would observe in the actual data. We can assess the calibration of a model by grouping its predictions into bins and assessing how often we observe the outcome versus how often each model expects to observe the outcome.
A model that is well calibrated will closely follow the dashed line: its expected probabilities match the observed probabilities. A model that consistently underestimates the probability of the event will be over this dashed line, while a model that overestimates the probability will be under the dashed line.
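The binning procedure described above can be sketched as follows. This is an illustrative Python version with equal-width bins and hypothetical predictions; the notebook may bin differently, and `calibration_table` is not a function from any library.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width bins and compare predicted vs. observed rates."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for idx, members in enumerate(bins):
        if not members:
            continue  # skip empty bins
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append((idx, len(members), mean_predicted, observed_rate))
    return rows


# Hypothetical predictions and outcomes (1 = the user owns the game).
probs = [0.05, 0.10, 0.15, 0.90, 0.95]
outcomes = [0, 0, 1, 1, 1]

table = calibration_table(probs, outcomes, n_bins=2)
for idx, count, mean_predicted, observed_rate in table:
    print(idx, count, round(mean_predicted, 3), round(observed_rate, 3))
```

In the lower bin of this toy data the model predicts about 0.10 on average while a third of the games are owned, so that point would sit above the dashed line (an underestimate).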
I first assessed the models based on their performance via resampling on the training set.
But how well does my modeling approach do in predicting new games? To test this, I assessed the performance of the model (which was trained on games published through 2021) on games published in 2022-2023.
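A time-based split of this kind can be sketched in Python as below. The records and the `year` field are hypothetical, and `split_by_year` is an illustrative helper; the cutoffs simply mirror the years stated above.

```python
def split_by_year(games, train_through=2021, valid_years=(2022, 2023)):
    """Train on games published through a cutoff year; validate on later years."""
    train = [g for g in games if g["year"] <= train_through]
    valid = [g for g in games if g["year"] in valid_years]
    return train, valid


# Hypothetical records with a publication year.
games = [
    {"name": "Game A", "year": 2019},
    {"name": "Game B", "year": 2021},
    {"name": "Game C", "year": 2022},
    {"name": "Game D", "year": 2023},
    {"name": "Game E", "year": 2024},
]

train, valid = split_by_year(games)
print([g["name"] for g in train], [g["name"] for g in valid])
```

Splitting by publication year rather than at random keeps the validation games strictly newer than anything the model saw in training.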
How well did the model do? The following table shows the model’s predictions for games in the validation set.
As before, I can then assess the performance of the model.
| type | wflow_id | .metric | .estimate |
|---|---|---|---|
| valid | glmnet | mn_log_loss | 0.012 |
| valid | lightgbm | mn_log_loss | 0.010 |
| valid | glmnet | roc_auc | 0.919 |
| valid | lightgbm | roc_auc | 0.950 |
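The `mn_log_loss` rows report mean log loss, where lower is better. As a rough guide to the scale, here is a small Python sketch of the standard binary log loss formula on hypothetical predictions; it is not the implementation used for the table above.

```python
import math


def mean_log_loss(probs, outcomes, eps=1e-15):
    """Average negative log-likelihood of the observed 0/1 outcomes."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical predictions for four games (1 = owned).
coin_flip = mean_log_loss([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1])      # ln(2), about 0.693
confident = mean_log_loss([0.99, 0.01, 0.01, 0.99], [1, 0, 0, 1])  # -ln(0.99), about 0.010
```

Unlike roc_auc, log loss depends on how rare the outcome is, so values are best compared between models evaluated on the same set of games.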
What new and upcoming games does the model predict for mrbananagrabber?
The following table displays the top 15 games published after 2021 with the highest probability of entering the user’s collection.
Top 15 (Newer) Games for mrbananagrabber
Rankings based on predictive model trained on user's collection using games released through 2021
| rank | game | description | Pr(Own) | Own | |
|---|---|---|---|---|---|
| 1 | Terminus (2023) | You and your competitors’ transit companies have been hired by the city to build new subway lines and commercial developments to improve the city's bottom line. Manage assets such as time, money, & resources to build your subway line. Gain prestige by completing objectives and fulfilling the city’s transit demands. Focus on individual projects, open Agendas or a little of both in an effort t... | 0.919 | no | |
| 2 | Gloomhaven: Second Edition (2024) | Gloomhaven: Second Edition is a revised and elevated version of the award-winning core game of Gloomhaven. This is the culmination of everything Isaac Childres and the growing Cephalofair Games team have learned since the initial release of Gloomhaven, including feedback from the community, playtesters, co-designers, and developers. The world, story, and challenging gameplay are all still the ... | 0.864 | no | |
| 3 | Galactic Cruise (2024) | Hello, and welcome to Galactic Cruise. Here, we offer our guests something special: the comfort of a luxury cruise with the innovation of space travel. As the first company to offer extended-stay space vacations, we are excited to have you working for us! As a supervisor of this company, you’ll be expected to not only build these shuttles and satisfy our guests, but also to help the company th... | 0.689 | no | |
| 4 | The Witcher: Old World (2023) | In The Witcher: Old World, you become a witcher — a professional monster slayer — and immerse yourself in the legendary universe of The Witcher franchise. Set years before the saga of Geralt of Rivia, The Witcher: Old World explores a time when monsters roamed the Continent in greater numbers, creating a constant peril that required the attention of expertly trained monster slayers, known as w... | 0.571 | no | |
| 5 | Frosthaven (2022) | Frosthaven is the story of a small outpost far to the north of the capital city of White Oak, an outpost barely surviving the harsh weather as well as invasions from forces both known and unknown. There, a group of mercenaries at the end of their rope will help bring back this settlement from the edge of destruction. Not only will they have to deal with the harsh elements, but there are other, ... | 0.550 | yes | |
| 6 | Unmatched Adventures: Tales to Amaze (2023) | Unmatched Adventures: Tales to Amaze, which is themed around the pulp adventures, tall tales, and local legends of the mid-20th century, gives you a whole new way to play Unmatched. In the game, players work together to defeat one of two villains: Mothman or the Martian Invader. Each villain has a unique battlefield with unique objectives. If the villain completes their objective (or defeats t... | 0.469 | no | |
| 7 | Unmatched: Jurassic Park – Dr. Sattler vs. T. Rex (2022) | In battle, there are no equals. "Dinosaurs eat man… Woman inherits the earth." The greatest predator the world has ever known is closing in on the tenacious Dr. Sattler. Who has the slightest idea what to expect? In Unmatched: Jurassic Park – Dr. Sattler vs. T. Rex, the massive T rex unleashes fearsome attacks and seems unstoppable while Dr. Sattler makes full use of her surroundings and the a... | 0.468 | yes | |
| 8 | Star Wars: The Deckbuilding Game (2023) | The Rebel Alliance fights valiantly against the tyranny of the Galactic Empire. Each new victory brings the Rebels hope, and each heroic sacrifice strengthens their resolve. Still, the Empire's resources are vast, and the firepower of its Empire Navy is unmatched. With neither side willing to accept defeat, their war rages across the galaxy... In Star Wars: The Deckbuilding Game, a head-to-hea... | 0.399 | no | |
| 9 | Unmatched: Brains and Brawn (2023) | Unmatched: Brains and Brawn, the fifth and final Unmatched Marvel set, features some of Marvel's hottest heroes: Spider-Man, Dr. Strange, and She-Hulk. Spidey swings around the battlefield, using his spider-sense to keep him safe. Dr. Strange has, well, the best card names in the game: Behold the Seven Suns of Cinnibus! And She-Hulk won't think twice about throwing the book — or whatever heavy ... | 0.397 | no | |
| 10 | Neo-G Racing (Clear Duels Series) (2023) | The year is 2123 AD. Neo-Gravity Racing has become the biggest sport in the world. Get on the track. Hit the boost pads — but watch out for attacks! Do you have what it takes to cross the finish line first? Neo-G Racing (Clear Duels Series) is a futuristic combat racing game for 2 players (competitive) that combines hand management, route building, sealed bidding, racing, layering, and grid co... | 0.386 | no | |
| 11 | Defenders of the Dictionary (2024) | You may notice that your daily comms are filling with slang, shorthand, and emojis, all of which are autocorrected by technology that knows what you're about to say or think. Thankfully there is a task force fighting back against this linguistic corruption. Your task, as a Defender of the Dictionary, is to fight the evil forces threating our language. Your weapon is your collective wisdom, but... | 0.379 | no | |
| 12 | Unmatched: Houdini vs. The Genie (2022) | Unmatched is a highly asymmetrical miniature fighting game for two or four players. Each hero is represented by a unique deck designed to evoke their style and legend. Tactical movement and no-luck combat resolution create a unique play experience that rewards expertise, but just when you've mastered one set, new heroes arrive to provide all new match-ups. Unmatched: Houdini vs. The Genie adds ... | 0.365 | no | |
| 13 | Union Stockyards (2023) | Union Stockyards is a mid-weight economic euro game with unique features: A supply/demand driven market that is central to game play, not a sidebar. Low randomness -- market changes due to player decisions. A worker-placement game where your workers may go on strike. Extensively-researched historical theme about one of the great industrial wonders of U.S. Gilded Age. Ope... | 0.343 | no | |
| 14 | Nebula (2023) | It is said that for every person who has inhabited the Earth, there is more than one shining star in the universe... In Nebula you will have to collect stars to position them in your galaxy. Stars come in various stages, from red giants to white dwarfs, which you can get from star clusters. You can get stars by investing Time or going to Chaos. Then you must position them in some orbit and con... | 0.339 | no | |
| 15 | Unmatched: Sun's Origin (2023) | Unmatched: Sun's Origin spotlights two heroes from the rich history of Japan. Oda Nobunaga was the daimyo of the Oda clan, renowned for unifying feudal Japan. He is a master tactician, making his honor guard even more dangerous (and just so happens to be a powerhouse in Tales To Amaze). Tomoe Gozen was a legendary onna-musha of the Minamoto clan. She strikes hard and fast, relentlessly pursui... | 0.328 | no | |
Why did the model predict these games?
Finally, I can examine predictions for all newer and upcoming games.